k-NN Regression on Functional Data with Incomplete Observations

نویسندگان

  • Sashank J. Reddi
  • Barnabás Póczos
چکیده

In this paper we study a general version of regression where each covariate itself is a functional data such as distributions or functions. In real applications, however, typically we do not have direct access to such data; instead only some noisy estimates of the true covariate functions/distributions are available to us. For example, when each covariate is a distribution, then we might not be able to directly observe these distributions, but it can be assumed that i.i.d. sample sets from these distributions are available. In this paper we present a general framework and a kNN based estimator for this regression problem. We prove consistency of the estimator and derive its convergence rates. We further show that the proposed estimator can adapt to the local intrinsic dimension in our case and provide a simple approach for choosing k. Finally, we illustrate the applicability of our framework with numerical experiments.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rates of Uniform Consistency for k-NN Regression

We derive high-probability finite-sample uniform rates of consistency for k-NN regression that are optimal up to logarithmic factors under mild assumptions. We moreover show that kNN regression adapts to an unknown lower intrinsic dimension automatically. We then apply the k-NN regression rates to establish new results about estimating the level sets and global maxima of a function from noisy o...

متن کامل

Functional Modeling of Iranian Precipitation Based on Temperature and Humidity

Functional Data Analysis (FDA) has recently made considerable progress because of easier access to the data that are essentially in the form of curves. Modeling of Iranian precipitation based on temperature and humidity with continuous the essential nature of such phenomena that are continuous functions of time has not been done properly. The corresponding data are generally collected daily or ...

متن کامل

The roles of nearest neighbor methods in imputing missing data in forest inventory and monitoring databases

Almost universally, forest inventory and monitoring databases are incomplete, ranging from missing data for only a few records and a few variables, common for small land areas, to missing data for many observations and many variables, common for large land areas. For a wide variety of applications, nearest neighbor (NN) imputation methods have been developed to fill in observations of variables...

متن کامل

Comparing K Nearest Neighbours Methods and Linear Regression - Is There Reason To Select One Over the Other?

Non-parametric k nearest neighbours (k-nn) techniques are increasingly used in forestry problems, especially in remote sensing. Parametric regression analysis has the advantage of well-known statistical theory behind it, whereas the statistical properties of k-nn are less studied. In this study, we compared the relative performance of k-nn and linear regression in an experiment. We examined the...

متن کامل

Prediction of the waste stabilization pond performance using linear multiple regression and multi-layer perceptron neural network: a case study of Birjand, Iran

Background: Data mining (DM) is an approach used in extracting valuable information from environmental processes. This research depicts a DM approach used in extracting some information from influent and effluent wastewater characteristic data of a waste stabilization pond (WSP) in Birjand, a city in Eastern Iran. Methods: Multiple regression (MR) and neural network (NN) models were examined u...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014